Do Your Worst to Make the Best: Paradoxical Effects in PageRank
Abstract
Deciding which kind of visit accumulates high-quality pages more quickly is one of the most frequently debated issues in the design of web crawlers. It is known that breadth-first visits work well, as they tend to discover pages with high PageRank early in the crawl. Indeed, this visit order is much better than depth-first, which is in turn even worse than a random visit; nevertheless, breadth-first can be outperformed by an omniscient visit that chooses, at every step, the node of highest PageRank in the frontier. This paper discusses a related, and previously overlooked, measure of effectiveness for crawl strategies: whether the graph obtained after a partial visit is in some sense representative of the underlying web graph as far as the computation of PageRank is concerned. More precisely, we are interested in determining how rapidly the computation of PageRank over the visited subgraph yields relative ranks that agree with the ones the nodes have in the complete graph; ranks are compared using Kendall's τ. We describe a number of large-scale experiments that show the following paradoxical effect: visits that gather PageRank more quickly (e.g., highest-quality-first) are also those that tend to miscalculate PageRank. Finally, we perform the same kind of experimental analysis on some synthetic random graphs, generated using well-known web-graph models: the results are almost opposite to those obtained on real web graphs.
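The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the six-node graph `toy_web`, the helper names, the damping factor of 0.85, the uniform treatment of dangling nodes, and the use of the simple tau-a form of Kendall's τ (the paper may handle ties differently). The sketch runs a breadth-first visit, computes PageRank on the visited subgraph and on the full graph, and measures how well the two rankings agree.

```python
from collections import deque
from itertools import combinations

def pagerank(graph, damping=0.85, iters=100):
    """Power iteration; dangling nodes spread their rank uniformly (assumed)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new[w] += damping * rank[v] / len(graph[v])
        rank = new
    return rank

def bfs_visit(graph, seed, limit):
    """Return the first `limit` nodes reached breadth-first from `seed`."""
    seen, order, queue = {seed}, [], deque([seed])
    while queue and len(order) < limit:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order

def induced_subgraph(graph, visited):
    """Keep only the visited nodes and the links between them."""
    keep = set(visited)
    return {v: [w for w in graph[v] if w in keep] for v in visited}

def kendall_tau(xs, ys):
    """Tau-a: (concordant - discordant) / number of pairs; ties count as neither."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / pairs

# Hypothetical toy graph: node -> list of out-links (no duplicate links).
toy_web = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": ["c", "e"],
    "e": ["a", "f"],
    "f": ["e"],
}

full_rank = pagerank(toy_web)
visited = bfs_visit(toy_web, seed="a", limit=4)
partial_rank = pagerank(induced_subgraph(toy_web, visited))

# Agreement between partial and full rankings, restricted to visited nodes.
tau = kendall_tau([full_rank[v] for v in visited],
                  [partial_rank[v] for v in visited])
print(visited, round(tau, 3))
```

Swapping `bfs_visit` for another visit order (depth-first, random, or highest-PageRank-first) and tracking `tau` as `limit` grows gives a small-scale version of the experiment the abstract describes; a graph this small says nothing about the behaviour reported for real web graphs.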